SIMPLIFICATION OF COMPLEXES FOR PERSISTENT
HOMOLOGY COMPUTATIONS.

PAWEL DLOTKO AND HUBERT WAGNER 



Abstract. In this paper we focus on preprocessing for persistent homology computations. We adapt some techniques which were successfully used for standard homology computations. The main idea is to reduce the complex prior to generating its boundary matrix, which is costly to store and process. We discuss the following reduction methods: elementary collapses, coreductions (as defined by Mrozek and Batko) and the acyclic subspace method (introduced by Mrozek, Pilarczyk and Zelazna).

1. Introduction



Persistent homology is increasingly applied to a wide range of practical problems, from sensor networks to root architecture analysis. Performance, however, still tends to be a problem. In this paper we propose efficient preprocessing algorithms. In the case of standard (i.e. non-persistent) homology the following preprocessing techniques were successfully used: elementary collapses [16], coreductions [11] and the acyclic subspace method [12]. The basic idea behind these techniques is to remove a number of cells without affecting the homology (or affecting it in a controlled way). Importantly, these methods work on raw data, that is, before the boundary matrix is produced.

On a practical note, the memory overhead of storing the boundary matrix is significant. For example, a 3-dimensional image of size $2000^3$ voxels with 16-bit gray-scale values takes roughly 16GB, while its boundary matrix takes roughly 200GB. To handle data of such sizes, it is crucial to preprocess before generating the boundary matrix.
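The arithmetic behind the raw-image figure can be checked with a short sketch. It assumes (this is our assumption, not a statement of the text) that the image is turned into a cubical complex in which every voxel is a top-dimensional cube, so that a grid with n voxels per side has (2n+1)^3 cells in total. The function names are hypothetical.

```python
def raw_image_bytes(side: int, bytes_per_voxel: int = 2) -> int:
    """Size in bytes of a side x side x side image (16-bit values by default)."""
    return side ** 3 * bytes_per_voxel


def cubical_cell_count(side: int, dim: int = 3) -> int:
    """Number of cells (of all dimensions) in a full cubical grid complex.

    Along each axis there are side intervals and side + 1 vertices,
    hence 2 * side + 1 elementary pieces per axis.
    """
    return (2 * side + 1) ** dim


if __name__ == "__main__":
    print(raw_image_bytes(2000))        # 16_000_000_000 bytes, i.e. about 16GB
    print(cubical_cell_count(2000))     # about 8 times the number of voxels
```

Under this assumption the complex has roughly eight times as many cells as the image has voxels, which indicates why a boundary matrix is much larger than the raw data. The exact matrix size depends on the chosen representation and is not derived here.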

The goal of this paper is to extend the mentioned preprocessing methods (elementary collapses, coreductions and acyclic subspace) to persistent homology. The techniques presented in this paper are heuristics, meaning there are no provable bounds on how many cells are reduced. They have, however, performed well in practical situations as described in [7], [11] and [12]. The computational complexity of the presented techniques is linear in the number of cells, provided the number of neighbors of every cell in the complex is $O(1)$ (this assumption is true for cubical and simplicial complexes which are used in practice). Therefore they can be used as an inexpensive preprocessing step.

2. Background 

2.1. Chain complexes and homology. The presented methods work for arbitrary chain complexes with field coefficients. In the most typical case this chain complex comes from a CW-decomposition of a given space. In practice, simplicial and cubical complexes are used. For simplicity, we use $\mathbb{Z}_2$ coefficients throughout the paper, as this is the standard setting for persistence. However, the presented algorithms work for any field coefficients. Intuitively, in this setting, a chain complex can be viewed as a set of abstract cells (e.g. cubes or simplices) connected by a boundary operator as specified below.

Let us fix a chain complex $\mathcal{K}$. Let a $p$-chain be a formal sum of $p$-cells with $\mathbb{Z}_2$ coefficients. The $p$-chains of $\mathcal{K}$, together with addition modulo 2, form a group of $p$-chains, denoted by $C_p(\mathcal{K})$. The boundary operator $\partial_p$ maps $p$-chains into $(p-1)$-chains; the cells appearing in the boundary of a cell are called its faces. The definition of chain complexes requires that $\partial_p \circ \partial_{p+1} = 0$.

The chain of (co-)faces is called a (co-)boundary. For any $p$-chain $c = \sum_i a_i c_i$, we have $\partial_p c = \sum_i a_i \partial_p c_i$. If a $(p-1)$-cell $a$ has a $p$-cell $B$ in its coboundary, we say $a$ is a face of $B$, and $B$ is a coface of $a$. (Notation: whenever a cell and its face are considered, the capital letter denotes the higher dimensional cell.) Let us take a set $A \subset \mathcal{K}$. By the boundary of $A$ we mean $\mathrm{bd}\, A = \{ b \in \mathcal{K} \mid \exists_{a \in A}\; b \in \partial a \}$. We say that $A$ is closed if $\mathrm{bd}\, A \subset A$.

To define homology let us first introduce the group of $p$-cycles, $Z_p(\mathcal{K}) = \ker \partial_p$, and its subgroup: the group of $p$-boundaries, $B_p(\mathcal{K}) = \operatorname{im} \partial_{p+1}$. The $p$-th homology group is the quotient $H_p(\mathcal{K}) = Z_p(\mathcal{K}) / B_p(\mathcal{K})$. The $p$-th Betti number, denoted by $\beta_p(\mathcal{K})$, is the rank of this group.
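The definitions above translate directly into a small computation. The following is a minimal sketch, assuming $\mathbb{Z}_2$ coefficients and boundary matrices given as lists of 0/1 rows; it uses the identity $\beta_p = \dim C_p - \operatorname{rank} \partial_p - \operatorname{rank} \partial_{p+1}$. The names `rank_mod2` and `betti_numbers` are illustrative and not taken from any library.

```python
def rank_mod2(matrix):
    """Rank over Z_2 of a matrix given as a list of rows of 0/1 entries."""
    # Encode each row as an integer bitmask so that row addition is XOR.
    rows = [sum(bit << j for j, bit in enumerate(row)) for row in matrix]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def betti_numbers(cell_counts, boundaries):
    """Betti numbers over Z_2.

    cell_counts[p] is the number of p-cells; boundaries[p] (for p >= 1) is
    the matrix of the boundary operator in dimension p, one row per p-cell,
    one column per (p-1)-cell.
    """
    top = len(cell_counts) - 1
    ranks = [0] * (top + 2)  # ranks[0] and ranks[top + 1] stay zero
    for p in range(1, top + 1):
        ranks[p] = rank_mod2(boundaries[p])
    return [cell_counts[p] - ranks[p] - ranks[p + 1] for p in range(top + 1)]


if __name__ == "__main__":
    # Hollow triangle: vertices a, b, c and edges ab, bc, ac.
    d1 = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
    print(betti_numbers([3, 3], {1: d1}))  # one component, one loop
```

This dense approach is only meant to illustrate the definitions; it stores exactly the boundary matrix that the reductions discussed in this paper try to avoid building for the full input.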

2.2. Filtrations and persistence. Given a complex $\mathcal{K}$ and a filtering function $g : \mathcal{K} \to \mathbb{Z}$, persistent homology studies homological changes of the sub-level complexes, $\mathcal{K}_t = g^{-1}((-\infty, t])$. We require that $g(a) \le g(B)$ whenever $a$ is a face of $B$, which implies that for every $t \in \mathbb{Z}$, $\mathcal{K}_t$ forms a complex, i.e. the boundary of every cell in $\mathcal{K}_t$ is contained in $\mathcal{K}_t$ (we call this a sublevel-complex filtration). Although in general the filtration has values in $\mathbb{Z}$, to simplify the notation we sometimes assume that the filtration starts from zero. Every filtration on a finite cell complex can be transformed to such a filtration by a monotone change of coordinates, therefore this extra assumption does not limit the generality of the presented approach.

We say that $g$ is a filtration by maxima if for every $A \in \mathcal{K}$, $g(A) = \max\{g(b_1), \ldots, g(b_n)\}$, where $b_1, \ldots, b_n$ are the faces of $A$. Such a filtration is often used when the function values are given only on the vertices and have to be extended to higher-dimensional cells. All the presented methods work for general filtrations, but this type of filtration is convenient for examples.
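As an illustration, a filtration by maxima on a simplicial complex can be built from vertex values as follows. This is a sketch under the assumption that the complex is given by its top-dimensional simplices, each a tuple of vertex labels; for simplices, the maximum over the faces equals the maximum over the vertices. The function name is hypothetical.

```python
from itertools import combinations


def filtration_by_maxima(vertex_values, top_simplices):
    """Extend vertex values to all faces of the given simplices by maxima.

    Returns a dict mapping each simplex (a sorted tuple of vertices)
    to its filtration value.
    """
    g = {}
    for simplex in top_simplices:
        vertices = sorted(simplex)
        for k in range(1, len(vertices) + 1):
            for face in combinations(vertices, k):
                g[face] = max(vertex_values[v] for v in face)
    return g


if __name__ == "__main__":
    g = filtration_by_maxima({0: 0, 1: 2, 2: 5}, [(0, 1, 2)])
    for cell in sorted(g, key=lambda c: (g[c], len(c))):
        print(cell, g[cell])
```

By construction every face receives a value no larger than that of its cofaces, so each sub-level set is a complex.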

Persistent homology captures the birth and death times of homology classes of the sub-level complexes, as $t$ grows from $-\infty$ to $+\infty$. By birth, we mean that a homology class is created; by death, we mean it either becomes trivial or becomes identical to some other class born earlier. The persistence, or lifetime, of a class is the difference between the death and birth times. Often a multiset of persistence intervals is used to represent persistence in a given dimension. A single interval encodes the lifetime of a homology class. We say that two spaces have the same persistence if their persistence intervals are the same in the corresponding dimensions. We disregard zero-length persistence intervals, because they carry virtually no information.

The formal definition is as follows (after [5]): The $p$-th persistent homology groups of a filtered complex $\mathcal{K}$ are the images of the homomorphisms induced by inclusion, $H_p^{i,j}(\mathcal{K}) = \operatorname{im} H_p(f^{i,j})$, where $f^{i,j} : \mathcal{K}_i \hookrightarrow \mathcal{K}_j$ for every $i, j \in \mathbb{Z}$, $i \le j$. By $H^p(\mathcal{K})$ we denote the persistent homology of a filtered cell complex $\mathcal{K}$. So-called persistence diagrams encode the entire information about the persistence of a filtered complex. For more details see [5].

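We recall a theorem stating when the persistence of two filtered complexes is equal:

Theorem 2.1 (Persistence equivalence theorem, [5]). Consider persistent homology of two filtered complexes $X$ and $Y$. Let $\varphi_i : H_*(X_i) \to H_*(Y_i)$:

$$\begin{array}{ccccccccc}
\cdots \to & H_*(X_0) & \to & H_*(X_1) & \to \cdots \to & H_*(X_{n-1}) & \to & H_*(X_n) & \to \cdots \\
 & \downarrow \varphi_0 & & \downarrow \varphi_1 & & \downarrow \varphi_{n-1} & & \downarrow \varphi_n & \\
\cdots \to & H_*(Y_0) & \to & H_*(Y_1) & \to \cdots \to & H_*(Y_{n-1}) & \to & H_*(Y_n) & \to \cdots
\end{array}$$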

If the $\varphi_i$ are isomorphisms and all the squares commute, then the persistent homology of $X$ and $Y$ is the same.

The standard algorithm to compute persistent homology is a matrix reduction algorithm presented in [5]. Other algorithms to compute persistence are discussed in Section 2.3. The output of the persistent homology computations is a list of persistence pairs of the form (birth, death).
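For orientation, the column reduction at the heart of the standard algorithm can be sketched as follows. This is a minimal illustration over $\mathbb{Z}_2$, assuming the cells are already sorted compatibly with the filtration (every face precedes its cofaces) and that each boundary column is given as a set of row indices. The function name and the input format are our own choices, not the notation of [5].

```python
def persistence_pairs(columns):
    """Reduce a Z_2 boundary matrix column by column.

    columns[j] is the set of indices of the cells in the boundary of cell j.
    Returns (pairs, essential): index pairs (birth cell, death cell) and the
    indices of cells that create a class which never dies.
    """
    columns = [set(c) for c in columns]  # work on a copy
    low_to_col = {}                      # lowest row index -> reduced column
    pairs = []
    for j, col in enumerate(columns):
        # Add earlier reduced columns while the lowest entry is already taken.
        while col and max(col) in low_to_col:
            col ^= columns[low_to_col[max(col)]]
        if col:
            low_to_col[max(col)] = j
            pairs.append((max(col), j))
    paired = {index for pair in pairs for index in pair}
    essential = [j for j in range(len(columns)) if j not in paired]
    return pairs, essential


if __name__ == "__main__":
    # Cells 0,1,2: vertices a,b,c; 3: ab; 4: bc; 5: ac; 6: triangle abc.
    cols = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
    print(persistence_pairs(cols))
```

The index pairs are turned into (birth, death) values by looking up the filtration value of each cell; pairs whose two cells share a filtration value give the zero-length intervals that are disregarded.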

An important justification for the usage of persistence is the stability theorem. Cohen-Steiner et al. [3] proved that for any two filtering functions $f$ and $g$ the so-called bottleneck distance ($d_B$, see [3]) between the persistence of $\mathcal{K}$ with respect to $f$ (denoted as $H^p(\mathcal{K}, f)$) and the persistence of $\mathcal{K}$ with respect to $g$ (denoted as $H^p(\mathcal{K}, g)$) is bounded from above by the $L^\infty$ norm of the difference between $f$ and $g$:

(1) $d_B(H^p(\mathcal{K}, f), H^p(\mathcal{K}, g)) \le \|f - g\|_\infty = \max_{x \in \mathcal{K}} |f(x) - g(x)|$.

Simply put, small changes in the filtration values cause small changes in persistence. This enables robust estimation of how persistence is affected by perturbation of the input (e.g. noise).



2.3. Existing algorithms and their complexity. Applying the matrix-reduction algorithm described in [5] to the boundary matrix of the input complex is the standard way to compute persistent homology groups. It works for general complexes in arbitrary dimensions. The worst-case complexity is $O(n^3)$, where $n$ is the size of the input complex. Milosavljevic et al. [9] showed that persistent homology can be computed in matrix multiplication time $O(n^\omega)$, where the currently best estimate of $\omega$ is 2.373. Chen and Kerber [1] proposed a randomized algorithm to compute only pairs with persistence above a chosen threshold. Although these methods improve the theoretical complexity, it is unclear whether they are better in practice.

When focusing on 0-dimensional homology, union-find data structures can be used to compute 0-dimensional persistence in time $O(n\,\alpha(n))$ [3], where $\alpha$ is the inverse of the Ackermann function and $n$ the size of the input.
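A minimal sketch of the union-find approach is given below. It assumes a graph whose vertices carry filtration values and whose edges are filtered by maxima, and it applies the usual rule that, when two components merge, the one born later dies. For brevity it sorts the edges and omits union by rank, so it does not attain the bound quoted above. All names are illustrative.

```python
def zero_dim_persistence(vertex_values, edges):
    """0-dimensional persistence intervals of a vertex-filtered graph.

    vertex_values: dict vertex -> filtration value.
    edges: iterable of (u, v); an edge enters at max of its endpoint values.
    Returns a sorted list of (birth, death) with zero-length intervals dropped.
    """
    parent = {v: v for v in vertex_values}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def edge_value(edge):
        return max(vertex_values[edge[0]], vertex_values[edge[1]])

    intervals = []
    for u, v in sorted(edges, key=edge_value):
        older, younger = find(u), find(v)
        if older == younger:
            continue  # the edge closes a cycle, no component dies
        if vertex_values[older] > vertex_values[younger]:
            older, younger = younger, older
        birth, death = vertex_values[younger], edge_value((u, v))
        if death > birth:
            intervals.append((birth, death))
        parent[younger] = older  # the root stays the oldest vertex
    roots = {find(v) for v in vertex_values}
    intervals.extend((vertex_values[r], float("inf")) for r in roots)
    return sorted(intervals)


if __name__ == "__main__":
    values = {"a": 0, "b": 3, "c": 1}
    print(zero_dim_persistence(values, [("a", "b"), ("b", "c")]))
```

In the example the component born at vertex c (value 1) merges into the older component of a when the edge bc appears at value 3.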

A recent variation of the standard algorithm, introduced by Chen and Kerber [2], significantly reduces the amount of computation. This idea was also used by Wagner et al. [15] to compute persistence for $n$-dimensional images.

One can also compute persistence by computing the homology of inclusion maps, as discussed in [13] and [17]. This approach gives even richer information than persistence. It is also very time consuming, as noted in [13].

Another class of methods involves preprocessing the input complex using discrete Morse theory, as proposed by Robins [H]. Such preprocessing significantly reduces the size of the boundary matrix, while preserving persistence. In the case of 3D grayscale images, an efficient parallel implementation was proposed in [6], allowing for handling large ($\sim 1200^3$) images on commodity hardware. The standard matrix-reduction algorithm is used in the final step of computations.

The approach by Robins works for arbitrary complexes and in di- 
mension three the preprocessing results in the smallest possible bound- 
ary matrix [2J (counting the number of rows/columns). The algorithm 
used in [2J depends crucially on simple-homotopy theory, which makes 
it hard to directly generalize the optimality result to higher dimensions. 
A recent paper by Mischaikow et al. [10] proposes a handy theoretical framework, where discrete Morse theory is extended from complexes to filtrations.

It should be noted that the simplification methods based on discrete 
Morse theory aim at optimizing the number of cells. The total size of 
the resulting complex can in fact grow, since the number of connections 
can grow even quadratically in the number of cells. In other words, even 
if the initial boundary matrix is sparse, the matrix of the simplified 
complex can be dense. The methods presented in this paper do not have 
this property. Also, they can be used prior to persistence computations 
with discrete Morse theory. 

3. Elementary collapses. 

Elementary collapse is an old concept introduced by Whitehead in the context of simple homotopy types in [18]. It is often used in the context of homology theory. It finds a pair of cells $(A, b) \in \mathcal{K} \times \mathcal{K}$ (called an elementary collapse pair) such that $b$ has only one coface $A$ in $\mathcal{K}$ (in this case $b$ is called a free face). Removing such a pair does not change the homology or homotopy type of $\mathcal{K}$, since such a removal corresponds to a deformation retraction.

In this section we show that elementary collapses can also be used in the case of persistent homology. However, the elementary collapse pair $(A, b)$ must fulfill two extra assumptions:

(1) $b$ is a free face in $\mathcal{K}_n$ for every $n \in \mathbb{Z}$.
(2) $g(A) = g(b)$.

The first condition indicates that at every filtration level $(A, b)$ is an elementary collapse pair. It suffices to check the size of the coboundary of $b$ in the final complex of the filtration. The second condition indicates that the filtration levels of $A$ and $b$ need to be equal. In Figure 1 we show that these are in fact necessary conditions in order to preserve persistent homology information.



Figure 1. On the left the complex consisting of a single cube is depicted. All the vertices and edges have filtration level 0, while the two dimensional cube has filtration level 10. Consequently in the case of one-dimensional persistent homology an interval [0, 10] appears. However, when any elementary collapse is performed, the interval does not appear anymore. Therefore the assumption $g(A) = g(b)$ is necessary. In the picture on the right, if an elementary collapse is performed at the level indicated by the arrow, at sub-level 10 we are missing the shared edge and the 2-cell having value 0, so the homology (and consequently persistence) is changed.

Let us now prove that the presented extra assumptions guarantee that after removing a pair $(A, b)$, the persistent homology of a complex $\mathcal{K}$ does not change.

Theorem 3.1. Let $(A, b) \in \mathcal{K} \times \mathcal{K}$ be an elementary collapse pair and let $b$ be a free face in $\mathcal{K}_n$ for every $n \in \mathbb{Z}$. Moreover let $g(A) = g(b)$. Then $H^p(\mathcal{K}) = H^p(\mathcal{K} \setminus \{A, b\})$.



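Proof. The proof is based on Theorem 2.1. Let us consider the maps in homology induced by the inclusions $\mathcal{K}_1 \subset \mathcal{K}_2 \subset \ldots \subset \mathcal{K}_n$ before and after removal of the pair $(A, b)$:

$$\begin{array}{ccccc}
\cdots \xrightarrow{H(i_{l-1})} & H_*(\mathcal{K}_l) & \xrightarrow{H(i_l)} & H_*(\mathcal{K}_{l+1}) & \xrightarrow{H(i_{l+1})} \cdots \\
 & \downarrow H(j_l) & & \downarrow H(j_{l+1}) & \\
\cdots \xrightarrow{H(k_{l-1})} & H_*(\mathcal{K}_l \setminus \{A, b\}) & \xrightarrow{H(k_l)} & H_*(\mathcal{K}_{l+1} \setminus \{A, b\}) & \xrightarrow{H(k_{l+1})} \cdots
\end{array}$$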

The horizontal maps are the maps induced in homology by the inclusion maps $i_l : \mathcal{K}_l \hookrightarrow \mathcal{K}_{l+1}$ and $k_l : \mathcal{K}_l \setminus \{A, b\} \hookrightarrow \mathcal{K}_{l+1} \setminus \{A, b\}$. The vertical maps $H(j_l) : H_*(\mathcal{K}_l) \to H_*(\mathcal{K}_l \setminus \{A, b\})$ are isomorphisms, since $(A, b)$ is an elementary collapse pair in every $\mathcal{K}_l$ and $g(A) = g(b)$.

To show that all squares commute, let us pick a chain $c \in C(\mathcal{K}_l)$. Then $j_l(c) = c - \langle c, A \rangle A - \langle c, b \rangle b$, i.e. the generators $A$ and $b$ are removed from the chain $c$. We see that $k_l(j_l(c))$ is simply $j_l(c) \in C(\mathcal{K}_{l+1} \setminus \{A, b\})$. Let us now compute $j_{l+1}(i_l(c))$. We have $i_l(c) = c \in C(\mathcal{K}_{l+1})$. To obtain $j_{l+1}(i_l(c))$ it again suffices to remove from $i_l(c)$ the generators $A$ and $b$, which gives $j_{l+1}(i_l(c)) = c - \langle c, A \rangle A - \langle c, b \rangle b$. Therefore $j_{l+1}(i_l(c)) = k_l(j_l(c))$, so all the squares commute.

Therefore, due to Theorem 2.1, the persistence modules defined by the lower and upper sequences of complexes are equal, which completes the proof. □

It might seem that there could be just a few elementary collapse 
pairs to remove in the external boundary of the considered complex. 
However the reduction process usually creates new collapse pairs and 
it can be continued. 

The idea of the described procedure is summarized in Algorithm 1.

Algorithm 1 Elementary collapses for persistence.

Input: $\mathcal{K}$ - cell complex with filtration $g : \mathcal{K} \to \mathbb{Z}$;
Output: Reduced complex $\mathcal{K}'$ with the same persistence as $\mathcal{K}$.
$\mathcal{K}' = \mathcal{K}$;
List of cells $Q = \emptyset$;
for every cell $b \in \mathcal{K}'$ do
    if $b$ has a unique cell $A$ in its coboundary and $g(b) = g(A)$ then
        Q.enqueue(b);
while Q is not empty do
    b = Q.dequeue();
    if $b$ has a unique cell $A$ in its coboundary and $g(b) = g(A)$ then
        $\mathcal{K}' = \mathcal{K}' \setminus \{A, b\}$;
        for every element $c$ in the boundary of $A$ or the boundary of $b$ do
            if $c$ has a unique cell $D$ in its coboundary and $g(c) = g(D)$ then
                Q.enqueue(c);
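Algorithm 1 can be sketched in Python as follows. This is an illustrative version only, under the assumption that the complex is stored as a dictionary mapping every cell to the set of its codimension-one faces, with the filtration in a second dictionary. The function name and the data layout are hypothetical, and coboundaries are always evaluated in the current reduced complex.

```python
from collections import deque


def collapse_for_persistence(boundary, g):
    """Remove elementary collapse pairs (A, b) with g(A) == g(b).

    boundary: dict cell -> set of its codimension-one faces.
    g: dict cell -> filtration value.
    Returns the set of cells of the reduced complex.
    """
    alive = set(boundary)
    coboundary = {cell: set() for cell in boundary}
    for cell, faces in boundary.items():
        for face in faces:
            coboundary[face].add(cell)

    def free_coface(b):
        """Unique remaining coface of b with equal filtration value, or None."""
        if b not in alive:
            return None
        cofaces = [A for A in coboundary[b] if A in alive]
        if len(cofaces) == 1 and g[cofaces[0]] == g[b]:
            return cofaces[0]
        return None

    queue = deque(b for b in boundary if free_coface(b) is not None)
    while queue:
        b = queue.popleft()
        A = free_coface(b)  # re-check: the complex may have changed
        if A is None:
            continue
        alive -= {A, b}
        for c in boundary[A] | boundary[b]:
            if free_coface(c) is not None:
                queue.append(c)
    return alive


if __name__ == "__main__":
    tri = {"a": set(), "b": set(), "c": set(),
           "ab": {"a", "b"}, "bc": {"b", "c"}, "ac": {"a", "c"},
           "abc": {"ab", "bc", "ac"}}
    print(collapse_for_persistence(tri, {cell: 0 for cell in tri}))
```

With a constant filtration the filled triangle collapses to a single vertex. If the 2-cell is given a larger value than its edges, as in the left part of Figure 1, no pair qualifies and the complex is returned unchanged.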



4. Coreductions.

The concept of coreductions was introduced by Mrozek and Batko in [11]. It is based on the idea of searching for a homologically trivial subcomplex in the given complex, starting from the lowest possible dimension (bottom-up). An approach to finding acyclic subcomplexes that uses only top dimensional cells is shown in Section 5. The formal presentation requires the concept of an S-complex, which is a chain complex with a fixed basis. This concept is not recalled here for brevity's sake. For further details consult [11].

A pair of cells $(A, b)$ is a coreduction pair if $b$ is the unique element in the boundary of $A$. Such a pair cannot exist in a simplicial or cubical complex. Therefore in [11] first a single vertex is removed from each connected component of the considered complex. This removal changes only the zero dimensional homology. Now, the process of coreduction is iterated as long as one can find a coreduction pair in the considered complex.

The coreduction algorithm was already used to compute homology of inclusions and persistent homology in [13]. However, the approach there was different: in [13] the coreduction was used to remove as many elements as possible before computing the map in homology induced by the inclusion map between level sets. To be precise: at each level of the filtration the homology generators were computed and then, by solving a system of linear equations, one obtained the matrix mapping the generators at the $n$th level to the generators at the $(n+1)$th level. This procedure is more general than computing persistence. As indicated in [13], this strategy is in general not efficient for computing only persistence. Here we use the coreduction algorithm as a preprocessing step for the standard algorithm to compute persistence, as presented in [5].

First we show that removing a coreduction pair $(A, b)$ such that $g(A) = g(b)$ does not change the persistent homology. Then we show that removing a vertex changes only 0-dimensional persistence.

Since Corollary 4.2 from [11] is extensively used in this section, we recall it here:

Theorem 4.1 (Corollary 4.2 [11]). Let $\mathcal{K}$ be a chain complex (without filtration). If $(A, b)$ is a coreduction pair in $\mathcal{K}$, then $H(\mathcal{K})$ and $H(\mathcal{K} \setminus \{A, b\})$ are isomorphic.

Now we are ready to present the main theorem of this section, analogous to Theorem 3.1.

Theorem 4.2. Let $(A, b) \in \mathcal{K} \times \mathcal{K}$ be a coreduction pair. Moreover let $g(A) = g(b)$. Then $H^p(\mathcal{K}) = H^p(\mathcal{K} \setminus \{A, b\})$.



Proof. First we point out that if $(A, b)$ is a coreduction pair and $g(A) = g(b)$, then it is a coreduction pair in every filtration level in which $A$ and $b$ exist. Let us consider the filtered complex $\mathcal{K}$ before and after removal of a coreduction pair $(A, b)$:

$$\begin{array}{ccccc}
\cdots \xrightarrow{H(i_{l-1})} & H_*(\mathcal{K}_l) & \xrightarrow{H(i_l)} & H_*(\mathcal{K}_{l+1}) & \xrightarrow{H(i_{l+1})} \cdots \\
 & \downarrow H(j_l) & & \downarrow H(j_{l+1}) & \\
\cdots \xrightarrow{H(k_{l-1})} & H_*(\mathcal{K}_l \setminus \{A, b\}) & \xrightarrow{H(k_l)} & H_*(\mathcal{K}_{l+1} \setminus \{A, b\}) & \xrightarrow{H(k_{l+1})} \cdots
\end{array}$$

The horizontal maps $i_l : \mathcal{K}_l \hookrightarrow \mathcal{K}_{l+1}$ and $k_l : \mathcal{K}_l \setminus \{A, b\} \hookrightarrow \mathcal{K}_{l+1} \setminus \{A, b\}$ are inclusion maps. The vertical maps $H(j_l) : H_*(\mathcal{K}_l) \to H_*(\mathcal{K}_l \setminus \{A, b\})$ are isomorphisms due to Theorem 4.1 and the fact that $(A, b)$ is a coreduction pair in both $\mathcal{K}_l$ and $\mathcal{K}_{l+1}$ and $g(A) = g(b)$.

As in the proof of Theorem 3.1, one can show that all the squares commute. Therefore, due to Theorem 2.1, the persistence defined by the lower and upper sequences of complexes is equal, which completes the proof. □



It remains to show that we can remove a single vertex $V$ from the complex, and that this changes only the zero dimensional persistent homology. The fact that higher dimensional persistence is not affected follows from the definition of reduced homology and persistence. To be precise, let us recall the augmented chain complex in the setting of reduced homology:

$$\cdots \xrightarrow{\partial_{n+1}} C_n(\mathcal{K}) \xrightarrow{\partial_n} \cdots \xrightarrow{\partial_2} C_1(\mathcal{K}) \xrightarrow{\partial_1} C_0(\mathcal{K}) \xrightarrow{\epsilon} \mathbb{Z}_2 \to 0$$

where $\epsilon(\sum_i \alpha_i v_i) = \sum_i \alpha_i$ for $v_i \in \mathcal{K}$.

In this setting, removal of the initial vertex $V$ can be interpreted as a removal of a coreduction pair $(V, \emptyset)$, where $\emptyset$ is the unique generator of $\mathbb{Z}_2$ in dimension $-1$. From the properties of the reduced homology it follows that removing $V$ changes only the zero dimensional homology (and persistence).

In standard homology, once the coreduction pairs are removed as long as possible, all vertices in a given connected component are removed. The example in Figure 2 shows that in the case of persistent homology this does not hold: even if we start from the vertex with the lowest filtration value and perform the coreductions as long as possible, some vertices may remain in the considered connected component.

As can be seen in Figure 3, the whole information about zero-dimensional persistence is lost when the coreductions are done. However, it is fairly easy to compute zero dimensional persistence in near linear time from the original complex by using union-find data structures [5].

As can be seen in Figure 4, removing a coreduction pair $(A, b)$ such that $g(A) > g(b)$ does change the persistent homology of the given complex. Therefore the assumptions presented in Theorem 4.2 are crucial.

To summarize this section, the coreductions for persistence are presented in Algorithm 2.

Figure 2. One dimensional complex, filtered with maxima. Numbers indicate the filtration values. Initially the vertex having value 0 is removed. Arrows indicate the pairings done by coreductions. This example shows that, unlike in the case of standard homology, not all vertices from the connected component can be removed during coreductions.

Figure 3. An example showing that when the coreductions are done the information about zero dimensional persistence is lost. For the sake of clarity the filtration is simply a height function (it is not depicted with numbers). We assume the filtration of an edge is the maximal filtration level of its vertices. The coreductions are started by removing the bottom-left vertex and are indicated by arrows. It is easy to see that in the coreduced complex the vertices A, B and C generate infinite persistence intervals in zero dimensional persistent homology. Those infinite intervals are not present in the persistent homology of the initial complex.

Figure 4. Simple demonstration showing that the condition $g(A) = g(b)$ is necessary when removing a coreduction pair $(A, b)$. Suppose the coreduction pair $(A, b)$ presented in the picture is removed (the vertex on the left and the edges paired with the remaining vertices are already removed from the complex). It is then clear that once the pair $(A, b)$ is removed an interval [0, 10] is lost from one dimensional persistence.

Algorithm 2 Coreductions for persistence.

Input: $\mathcal{K}$ - cell complex with filtration $g : \mathcal{K} \to \mathbb{Z}$;
Output: Reduced complex $\mathcal{K}'$ with the same persistence for dimensions $\ge 1$ as $\mathcal{K}$.
$\mathcal{K}' = \mathcal{K}$;
List of cells $Q = \emptyset$;
for every connected component of $\mathcal{K}'$ do
    Remove a single vertex $V$ from the considered connected component of $\mathcal{K}'$;
for every cell $B \in \mathcal{K}'$ do
    if $B$ has a unique cell $a$ in its boundary and $g(B) = g(a)$ then
        Q.enqueue(B);
while Q is not empty do
    B = Q.dequeue();
    if $B$ has a unique cell $a$ in its boundary and $g(B) = g(a)$ then
        $\mathcal{K}' = \mathcal{K}' \setminus \{B, a\}$;
        for every element $C$ in the coboundary of $a$ or the coboundary of $B$ do
            if $C$ has a unique cell $d$ in its boundary and $g(C) = g(d)$ then
                Q.enqueue(C);
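A Python sketch of Algorithm 2 is given below. It is only an illustration, under the assumptions that the complex is stored as a dictionary from cells to the sets of their codimension-one faces and that the caller supplies one starting vertex per connected component (the parameter `start_vertices` is our own addition). Boundaries are always evaluated in the current reduced complex.

```python
from collections import deque


def coreduce_for_persistence(boundary, g, start_vertices):
    """Remove coreduction pairs (B, a) with g(B) == g(a).

    boundary: dict cell -> set of its codimension-one faces.
    g: dict cell -> filtration value.
    start_vertices: one vertex per connected component, removed first.
    Returns the set of cells of the reduced complex.
    """
    alive = set(boundary) - set(start_vertices)
    coboundary = {cell: set() for cell in boundary}
    for cell, faces in boundary.items():
        for face in faces:
            coboundary[face].add(cell)

    def unique_face(B):
        """Unique remaining face of B with equal filtration value, or None."""
        if B not in alive:
            return None
        faces = [a for a in boundary[B] if a in alive]
        if len(faces) == 1 and g[faces[0]] == g[B]:
            return faces[0]
        return None

    queue = deque(B for B in boundary if unique_face(B) is not None)
    while queue:
        B = queue.popleft()
        a = unique_face(B)  # re-check: the complex may have changed
        if a is None:
            continue
        alive -= {B, a}
        for C in coboundary[a] | coboundary[B]:
            if unique_face(C) is not None:
                queue.append(C)
    return alive


if __name__ == "__main__":
    tri = {"a": set(), "b": set(), "c": set(),
           "ab": {"a", "b"}, "bc": {"b", "c"}, "ac": {"a", "c"},
           "abc": {"ab", "bc", "ac"}}
    print(coreduce_for_persistence(tri, {cell: 0 for cell in tri}, ["a"]))
```

For a filled triangle with a constant filtration every cell is removed. As discussed above, the zero dimensional persistence has to be computed separately from the original complex.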



5. Acyclic subspace. 

The acyclic subspace method follows from the excision property. In standard homology theory it states that for a complex $\mathcal{K}$ and a subcomplex $\mathcal{A}$ of $\mathcal{K}$ such that $H_n(\mathcal{A}) = 0$ for $n > 0$ and $H_0(\mathcal{A}) = \mathbb{Z}^n$, where $n$ is the number of connected components of $\mathcal{K}$, we have $H_n(\mathcal{K}) = H_n(\mathcal{K}, \mathcal{A})$ for $n > 0$. Since $\mathcal{A}$ is closed we can use the following theorem:

Theorem 5.1 (Theorem 3.1 [H]). If $\mathcal{A}$ is closed in $\mathcal{K}$, then $H_*(\mathcal{K} \setminus \mathcal{A}) = H_*(\mathcal{K}, \mathcal{A})$.

Therefore it suffices to compute the homology of the chain complex obtained by removing all elements of $\mathcal{A}$ from $\mathcal{K}$. If one is able to efficiently find a large acyclic subcomplex $\mathcal{A}$, then this approach offers great performance gains. For instance, this method has been used to speed up cubical homology computations in [12] and for efficient computation of cohomology generators in [I]. A similar technique has been used to speed up computations of homology of inclusions [17].

The presented strategy can be especially useful in the case of cubical data, for example 3D images. In this case we work only on top dimensional cells. This is important, as the number of faces (of all dimensions) of a given cell is exponential in its dimension. Even though this method is limited to low dimensions, the performance gains can be significant.

In this section we demonstrate how this technique can be used to speed up computations of persistent homology. To do this, we need to introduce the concept of an acyclic subcomplex compatible with filtration. Let $\mathcal{A}$ be a subcomplex of a filtered complex $\mathcal{K}_0 \subset \ldots \subset \mathcal{K}_n = \mathcal{K}$. We say that $\mathcal{A}$ is an acyclic subcomplex compatible with filtration if $\mathcal{A}_i = \mathcal{K}_i \cap \mathcal{A}$ is an acyclic subcomplex in $\mathcal{K}_i$ for every $i \in \{0, \ldots, n\}$.

In order to obtain an acyclic subset compatible with filtration, first the maximal acyclic subcomplex $\mathcal{A}_0$ is constructed in $\mathcal{K}_0$, as described for instance in [12] (see the footnote below). Then the elements in $\mathcal{K}_1$ are processed to construct $\mathcal{A}_1$. An element $B \in \mathcal{K}_1$ is added to the complex iff:

(1) the intersection of $B$ with the acyclic complex constructed so far is acyclic, and

(2) there are no elements in the boundary of $B$ intersected with $\mathcal{K}_0$ that are not in the acyclic subcomplex $\mathcal{A}_0$.

For the higher values of filtration we use an analogous criterion. Point (2) for $g(B) = i$ should then be replaced with:

(2') there are no elements in the boundary of $B$ intersected with $\mathcal{K}_j$, for $j < i$, that are not in the acyclic subcomplex $\mathcal{A}_{i-1}$.

In this way we ensure that the closure of the final acyclic subcomplex intersected with every level of filtration is an acyclic subcomplex at this level of filtration, as shown in Theorem 5.2. This condition is necessary, as explained in Figure 5. This fact is used in the proof of Theorem 5.3.

Footnote: The idea of the procedure is as follows: first a top dimensional cell $A$ in $\mathcal{K}_0$ is chosen. Then all its neighboring elements in $\mathcal{K}_0$ are processed. An element $B \in \mathcal{K}_0$ is added to the acyclic subcomplex iff its intersection with the current acyclic subcomplex is acyclic. For fast tests, tabulated configurations for cubes and simplices can be used.



Theorem 5.2. As a result of the presented procedure an acyclic subcomplex compatible with filtration is obtained.

Proof. Due to (1) the obtained complex is acyclic. From point (2) we have that $\mathcal{K}_i \setminus \mathcal{A} = \mathcal{K}_i \setminus \mathcal{A}_i$. Therefore we obtain an acyclic subcomplex compatible with filtration. □





















Figure 5. At the first filtration level the 2-dimensional elements with filtration value 0 (marked with darker gray) belong to the acyclic subcomplex. If, as in [12], we considered only the intersection with the acyclic subcomplex when generating the acyclic subcomplex at filtration level 1, the middle 2-dimensional element would be added to the acyclic subset at filtration level 1. In the end, when the whole (closed) acyclic subspace is removed, we lose the [0, 1] persistence interval in dimension one. This cannot happen when the extra restrictions are imposed.



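Let us show that such a procedure does not change persistence in dimensions greater than or equal to 1.

Theorem 5.3. Let $\mathcal{K}_0 \subset \ldots \subset \mathcal{K}_n = \mathcal{K}$ be a filtered complex and let $\mathcal{A}_0 \subset \ldots \subset \mathcal{A}_n = \mathcal{A}$, such that $\mathcal{A}_i = \mathcal{K}_i \cap \mathcal{A}$, be an acyclic subcomplex compatible with filtration. Then the persistent homologies $H^p_l(\mathcal{K})$ and $H^p_l(\mathcal{K} \setminus \mathcal{A})$ are the same for $l > 0$.

Proof. Again let us write the following diagram:

$$\begin{array}{ccccc}
\cdots \to & H_*(\mathcal{K}_l) & \xrightarrow{H(i_l)} & H_*(\mathcal{K}_{l+1}) & \to \cdots \\
 & \downarrow H(j_l) & & \downarrow H(j_{l+1}) & \\
\cdots \to & H_*(\mathcal{K}_l \setminus \mathcal{A}_l) & \xrightarrow{H(k_l)} & H_*(\mathcal{K}_{l+1} \setminus \mathcal{A}_{l+1}) & \to \cdots
\end{array}$$

The horizontal arrows are induced by inclusion. While this is straightforward for the upper sequence, the lower one requires some explanation, since in general both $\mathcal{K}_l \subset \mathcal{K}_{l+1}$ and $\mathcal{A}_l \subset \mathcal{A}_{l+1}$. The map $k_l$ is an inclusion, since $\mathcal{A}_{l+1} \setminus \mathcal{A}_l \subset \mathcal{K}_{l+1} \setminus \mathcal{K}_l$.

From the exact sequence of the pair $(\mathcal{K}_l, \mathcal{A}_l)$:

$$\cdots \to H_n(\mathcal{A}_l) \to H_n(\mathcal{K}_l) \to H_n(\mathcal{K}_l, \mathcal{A}_l) \to \cdots$$

since $H_n(\mathcal{A}_l) = 0$ for $n > 0$, we have that $H_n(\mathcal{K}_l)$ is isomorphic to $H_n(\mathcal{K}_l, \mathcal{A}_l)$. From Theorem 5.1 we have that $H_n(\mathcal{K}_l, \mathcal{A}_l)$ is isomorphic to $H_n(\mathcal{K}_l \setminus \mathcal{A}_l)$. Therefore all the vertical arrows are isomorphisms for $n > 0$. Again, as in Theorem 3.1, one can show that all the squares commute, since $\mathcal{A}_{l+1} \setminus \mathcal{A}_l \subset \mathcal{K}_{l+1} \setminus \mathcal{K}_l$. Consequently, from Theorem 2.1 the persistence modules corresponding to the lower and upper complexes are equal for $n > 0$. □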

As for the zero dimensional persistence, here exactly the same situation as the one presented in Figure 3 occurs. Therefore all the information about zero dimensional persistence is lost. It can be retrieved in near linear time by using a union-find data structure.

Algorithm 3 summarizes the acyclic subspace method for persistence. In this algorithm only the top dimensional elements need to be represented.

The complexity of this algorithm is linear provided the cells of the complex and the neighboring cells can be iterated according to the filtration levels of the function $g$.

Algorithm 3 Acyclic subspace algorithm for persistence.

Input: $\mathcal{K}$ - cell complex with filtration $g : \mathcal{K} \to \mathbb{Z}$;
Output: $\mathcal{A}$ - acyclic subcomplex of $\mathcal{K}$;
min = minimal value of $g$ on $\mathcal{K}$;
$\mathcal{A} = \emptyset$, the acyclic subcomplex;
for every connected component $C$ of $\mathcal{K}$ do
    Pick any top dimensional cell $T \in C$ having minimal filtration value in its connected component. Set $\mathcal{A} = \mathcal{A} \cup \{T\}$;
for every filtration value $i$ starting from min in increasing order do
    List of cells $Q = \emptyset$;
    for every $T$ such that $g(T) = i$ do
        if $T \cap \mathcal{A}$ is acyclic and for every element $a \in T \setminus (T \cap \mathcal{A})$, $g(a) = i$ then
            $Q = Q \cup T$;
    while Q is not empty do
        T = Q.dequeue();
        if $T \cap \mathcal{A}$ is acyclic then
            $\mathcal{A} = \mathcal{A} \cup T$;
            for every cell $T'$ incident to $T$ such that $g(T') = i$ and $T' \notin \mathcal{A}$ do
                if $T' \cap \mathcal{A}$ is acyclic and for every element $a \in T' \setminus (T' \cap \mathcal{A})$, $g(a) = i$ then
                    $Q = Q \cup T'$;

6. Smoothing up the data

Let us fix a complex $\mathcal{K}$ and a filtering function $f$. The number of elements reduced with the described algorithms strongly depends on the filtering function $f$. It may happen, especially for noisy data, that no reduction can be made, because the values of a cell and some of its faces differ by a very small amount. In this section we show a heuristic for denoising such data while controlling the changes of persistence. This way we increase the effectiveness of the presented reduction methods. The idea is based on stability results for persistence [3]. As recalled in Equation (1), stability of persistence states that under an $\epsilon$ change (in the maximum norm) of the filtering function the persistence diagrams change by no more than $\epsilon$ in the so-called bottleneck distance [3].
We start with free face collapses and coreductions. One can view the denoising procedure as constructing a perturbed function $f' : \mathcal{K} \to \mathbb{Z}$ (we assume that $f' = f$ on all cells in which the perturbation did not take place). In this case, when performing free face collapses or coreductions, the requirement $g(A) = g(b)$ can be relaxed by checking:

(1) if $f(A) - f(b) \le \epsilon$, and

(2) if for every $B$ being a coface of $b$ we have $f(B) \ge f(A)$,

then the filtration value of the cell $b$ is perturbed to $f'(b) = f(A)$ and a reduction of the pair $(A, b)$ can be made when the complex $\mathcal{K}$ with the filtering function $f'$ is considered.
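The relaxed test can be sketched as follows. This is a minimal illustration, assuming that both inequalities are non-strict (our reading of the conditions) and that the cofaces are taken in the initial, unreduced complex. The function name and its parameters are hypothetical.

```python
def try_relaxed_pair(A, b, f, f_perturbed, original_cofaces, eps):
    """Check conditions (1) and (2) for the pair (A, b).

    f: original filtration values (never modified).
    f_perturbed: perturbed values f', updated in place on success.
    original_cofaces: dict cell -> set of cofaces in the initial complex.
    Returns True if b was raised to the level of A, False otherwise.
    """
    # Condition (1): the values of A and b differ by at most eps.
    if f[A] - f[b] > eps:
        return False
    # Condition (2): no coface of b enters the filtration before A.
    if any(f[B] < f[A] for B in original_cofaces[b]):
        return False
    f_perturbed[b] = f[A]
    return True


if __name__ == "__main__":
    f = {"b": 9, "A": 10, "B": 12}
    f_prime = dict(f)
    ok = try_relaxed_pair("A", "b", f, f_prime, {"b": {"A", "B"}}, eps=1)
    print(ok, f_prime["b"])
```

Because the test always reads the original values and the original cofaces, repeated applications cannot move any value by more than the tolerance.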

Let us show that $f'$ is a sublevel-complex filtration and $\|f - f'\|_\infty = \max_{a \in \mathcal{K}} |f(a) - f'(a)| \le \epsilon$. The fact that $f'$ is a filtration of $\mathcal{K}$ follows from point (2) above and the fact that if the filtration value of a coface of a perturbed cell $b$ (i.e. one with $f(b) \ne f'(b)$) is changed, then it is increased. The fact that for every $a \in \mathcal{K}$, $|f(a) - f'(a)| \le \epsilon$ follows from condition (1). Therefore, from stability of persistence [3] we have $d_B(H^p(\mathcal{K}, f), H^p(\mathcal{K}, f')) \le \epsilon$.

It is known from discrete Morse theory [8] that maximizing the number of reduced elements is NP-complete. Therefore a greedy strategy seems to be a viable option. One can perform a greedy reduction together with changing the value of the function if the conditions presented above are satisfied. We want to stress that whenever we want to change the value of a cell $a \in \mathcal{K}$ from $f(a)$ to $f'(a)$, the inequality $f(B) \ge f'(a)$, for every coface $B$ of $a$, needs to be checked in the initial complex (so we are not allowed to disregard the elements that have already been reduced). Otherwise the changes will accumulate and the $\epsilon$ precision will be lost, as presented in Figure 6.

A similar idea can be applied to the acyclic subcomplex method for persistence. Let us assume that the filtration $f$ is given only on top dimensional cells of a complex $\mathcal{K}$. Then $f$ is assumed to be extended to other cells by using the lower star filtration (i.e. every cell has a filtration value equal to the minimal filtration value of the incident top dimensional cells). Once one perturbs the values of the top dimensional cells ($f'$ is the perturbed function) so that for each top dimensional cell $A \in \mathcal{K}$ we have $|f(A) - f'(A)| \le \epsilon$, and constructs the lower star filtration based on $f'$, for every $a \in \mathcal{K}$ we will have $|f(a) - f'(a)| \le \epsilon$. Consequently $d_B(H^p(\mathcal{K}, f), H^p(\mathcal{K}, f')) \le \epsilon$. Due to this property one can do two things to increase the efficiency of the acyclic subcomplex method:

(1) Reduce the number of filtration levels of $f'$ with respect to $f$ as much as possible. This improves the complexity of the algorithm (we recall that the complexity of this algorithm depends on the number of filtration levels).

(2) Use a greedy strategy when constructing the maximal acyclic subcomplex, by changing the values of neighboring cells so as to construct a locally larger acyclic subcomplex.

In both cases we face a hard optimization problem. Therefore, the only feasible solution would be to use a greedy strategy.

7. Conclusions. 

In this paper, we adapted existing reduction techniques, which were beneficial in homology computation, to the context of persistent homology. In many practical cases such reductions should be used as a preprocessing step to avoid storing and processing the entire boundary matrix of the initial data. As in the case of standard homology computation, one can use the presented reductions in a sequence: acyclic subspace, then elementary collapses and coreductions.

Figure 6. Example showing that when changing the filtering function values after some reductions were made, one should always check the presented conditions on the initial complex, not the reduced one. Otherwise the changes could accumulate and one would lose the bounds on the distances of the output persistence, as presented in this example. On the left is the initial complex with the initial function values. The filtration is given on top dimensional cells and every lower dimensional cell has a filtration value equal to the minimum of the filtration values of the incident 2-dimensional cells. Suppose we want to obtain persistence with a tolerance $\epsilon = 1$. Then, if the reduced elements are forgotten and are not taken into account when checking $f(B) \ge f'(b)$ for $b$ being a face of $B$, all the reductions presented on the right can be made, which clearly gives an interval $[3, \infty]$ in dimension 1. The correct one, $[1, \infty]$, is farther than $\epsilon$ from the provided answer. By enlarging this example the difference can be made arbitrarily large.